QVAC-18608 fix(label-gate): preserve hyphens in input env-var names #1973
Merged
The QVAC-18612 canary (PR #1971, run id 25672483584) hard-failed with "required input 'github-token' is missing" even though the workflow clearly passed `github-token: ${{ secrets.GITHUB_TOKEN }}`.

Root cause: `getInput` in src/index.mjs was uppercasing the input name AND replacing hyphens with underscores, looking up `INPUT_GITHUB_TOKEN`. The GitHub Actions runner (and @actions/core) preserves hyphens (only spaces are replaced), so the runner sets `INPUT_GITHUB-TOKEN`. The action never found the token and threw a missing-input error.

The local smoke test that "passed" before merge set `INPUT_GITHUB_TOKEN=...` (matching the buggy lookup), so both sides were wrong in the same direction. This is exactly the failure mode the canary was meant to surface; without it, the gate would have failed across all 75 secret-bearing workflows on the first PR after the QVAC-18612 fan-out.

Fix:
- getInput now uses `name.replace(/ /g, '_').toUpperCase()`, matching the runner / @actions/core convention exactly.
- getInput is exported from src/index.mjs (with an injectable env arg) so the convention can be unit-tested.
- Top-level main() is gated on `import.meta.url === argv[1]` so importing index.mjs from tests no longer triggers a real run.

Tests:
- 9 new tests in test/index.test.mjs pin the env-var-name resolution:
  * INPUT_GITHUB-TOKEN (hyphen preserved) -> resolves
  * INPUT_GITHUB_TOKEN (hyphen replaced) -> does NOT resolve (locks the contract against accidental "helpful" rewrite)
  * spaces are still replaced with underscores
  * trim, missing-required, defaults-to-process.env
- Total: 53/53 pass via `node --test`.
- End-to-end smoke against the runner-correct env-var name (INPUT_GITHUB-TOKEN=...) confirms exit 0 and authorised=false on the no-label deny path.

Refs: https://app.asana.com/1/45238840754660/project/1214153063536860/task/1214612672233087
Related: #1971
Co-authored-by: Cursor <cursoragent@cursor.com>
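The corrected lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of src/index.mjs: the helper name `inputEnvName`, the options shape, and the placeholder token value are assumptions made for the example.

```javascript
// Hypothetical sketch of the corrected input lookup (not the real src/index.mjs).

// Build the env-var name the way the runner does: only spaces become
// underscores, hyphens are preserved, and the whole name is uppercased.
function inputEnvName(name) {
  return `INPUT_${name.replace(/ /g, '_').toUpperCase()}`;
}

// `env` is injectable so tests can pass a plain object instead of process.env.
function getInput(name, { required = false } = {}, env = process.env) {
  const value = (env[inputEnvName(name)] ?? '').trim();
  if (required && value === '') {
    throw new Error(`required input '${name}' is missing`);
  }
  return value;
}

// The runner-style name (hyphen preserved) resolves and is trimmed.
const runnerEnv = { 'INPUT_GITHUB-TOKEN': '  dummy-token  ' };
console.log(getInput('github-token', { required: true }, runnerEnv)); // "dummy-token"
```

Passing `{ INPUT_GITHUB_TOKEN: '...' }` (underscore variant) to the same call throws the missing-input error, which is the contract the new tests lock in.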
darkynt
approved these changes
May 11, 2026
Contributor
Tier-based Approval Status
sidj-thr
approved these changes
May 11, 2026
tamer-hassan-tether
approved these changes
May 11, 2026
NamelsKing
approved these changes
May 11, 2026
Collaborator
Author
/review
Proletter
added a commit
that referenced
this pull request
May 13, 2026
Re-land of the label-gate fan-out after PR #1997 was reverted on 2026-05-13 (commit 919850c). Re-architected to fix the caller-cap permissions violation that broke 30+ on-pr-* workflows the moment a verified label was applied.

Architecture: caller-gates-callee
- Reusable workflows (workflow_call invokees) are NOT modified. PR #1997 embedded a label-gate job inside each reusable callee with `pull-requests: write`, which violates the caller-cap rule for any caller that scopes the call to `pull-requests: read|none`. GitHub enforces this at parse time; the affected workflow files won't even load.
- Callers get a label-gate job at the top of `jobs:` with `pull-requests: write` (which never crosses a caller-cap boundary). Each `uses:` invocation that targets a secret-bearing reusable, plus every standalone secret-bearing job in the same workflow, gains `needs: [..., label-gate]` and an `if:` prepended with `needs.label-gate.outputs.authorised == 'true'`.
- When the gate denies on a `uses:` job, the entire reusable invocation is skipped: the callee runner never starts, no secrets are exposed, and no caller-cap validation can fire because the workflow_call payload is never sent.

The label-gate action checks out from the default branch via sparse checkout, which is the same Tanstack-class supply-chain mitigation landed in the canary fix on PR #1971 / #1973.

Workflow-by-workflow stats:
- 59 caller workflows migrated (label-gate + needs/if updates)
- 56 reusable callees, exempt workflows, and no-secret workflows intentionally left UNCHANGED on disk
- Pre-existing `authorize-pr` peer jobs preserved (belt-and-suspenders; removal is a follow-up after a soak period)
- approval-worker.yml and approval-check-worker.yml exempt (gating them creates a deadlock; we explicitly do not touch them)

Pre-flight verification before push:
- `python3 .github/scripts/audit-workflow-permissions.py` -> 0 hard violations across 162 caller-callee edges (vs. 21 hard violations after the naive PR #1997-style migration; the audit was added in the previous commit precisely to catch this regression class)
- `actionlint .github/workflows/*.{yml,yaml}` reports identical issue counts before and after the migration: 1832 shellcheck (pre-existing), 9 expression (pre-existing), 5 action (down from 7 pre-existing)

End-to-end validated in the qvac-internal sandbox with real org teams:
- tetherto/qvac-internal#12 (caller-gates-callee + standalone gating against the actual qvac-internal-{dev,merge,release} teams)
- Olutest/qvac-tests (public mirror; same harness, single-user allowlist)
- Validation matrix: 9/9 scenarios pass, including the strip-on-non-trusted-apply case

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Proletter <40578159+Proletter@users.noreply.github.com>
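The caller-cap rule that the commit message refers to can be illustrated with a small comparison of permission levels. The sketch below is hypothetical and is not the audit script from the PR (which is a Python script); the function name `capViolations` and the object shapes are assumptions, and scopes the caller does not list are treated as `none`, which assumes the caller declares an explicit `permissions:` block.

```javascript
// Hypothetical illustration of the caller-cap rule; not the PR's audit script.
// Permission levels, ordered from least to most privileged.
const LEVEL = { none: 0, read: 1, write: 2 };

// Returns every scope where the callee asks for more than the caller grants.
// Assumption: a scope missing from the caller's grant counts as 'none'.
function capViolations(callerGrant, calleeRequest) {
  const violations = [];
  for (const [scope, requested] of Object.entries(calleeRequest)) {
    const granted = callerGrant[scope] ?? 'none';
    if (LEVEL[requested] > LEVEL[granted]) {
      violations.push({ scope, requested, granted });
    }
  }
  return violations;
}

// A callee job asking for pull-requests: write under a read-only caller
// is the kind of edge the PR #1997-style migration produced.
console.log(capViolations({ 'pull-requests': 'read' }, { 'pull-requests': 'write' }));
```

Keeping the gate job in the caller avoids this comparison altogether, since the job's `pull-requests: write` never passes through a `workflow_call` boundary.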
Proletter
added a commit
that referenced
this pull request
May 14, 2026
…(re-land) (#2023)

* QVAC-18612 infra: gate every secret-bearing workflow with label-gate

Re-land of the label-gate fan-out after PR #1997 was reverted on 2026-05-13 (commit 919850c). Re-architected to fix the caller-cap permissions violation that broke 30+ on-pr-* workflows the moment a verified label was applied.

Architecture: caller-gates-callee
- Reusable workflows (workflow_call invokees) are NOT modified. PR #1997 embedded a label-gate job inside each reusable callee with `pull-requests: write`, which violates the caller-cap rule for any caller that scopes the call to `pull-requests: read|none`. GitHub enforces this at parse time; the affected workflow files won't even load.
- Callers get a label-gate job at the top of `jobs:` with `pull-requests: write` (which never crosses a caller-cap boundary). Each `uses:` invocation that targets a secret-bearing reusable, plus every standalone secret-bearing job in the same workflow, gains `needs: [..., label-gate]` and an `if:` prepended with `needs.label-gate.outputs.authorised == 'true'`.
- When the gate denies on a `uses:` job, the entire reusable invocation is skipped: the callee runner never starts, no secrets are exposed, and no caller-cap validation can fire because the workflow_call payload is never sent.

The label-gate action checks out from the default branch via sparse checkout, which is the same Tanstack-class supply-chain mitigation landed in the canary fix on PR #1971 / #1973.

Workflow-by-workflow stats:
- 59 caller workflows migrated (label-gate + needs/if updates)
- 56 reusable callees, exempt workflows, and no-secret workflows intentionally left UNCHANGED on disk
- Pre-existing `authorize-pr` peer jobs preserved (belt-and-suspenders; removal is a follow-up after a soak period)
- approval-worker.yml and approval-check-worker.yml exempt (gating them creates a deadlock; we explicitly do not touch them)

Pre-flight verification before push:
- `python3 .github/scripts/audit-workflow-permissions.py` -> 0 hard violations across 162 caller-callee edges (vs. 21 hard violations after the naive PR #1997-style migration; the audit was added in the previous commit precisely to catch this regression class)
- `actionlint .github/workflows/*.{yml,yaml}` reports identical issue counts before and after the migration: 1832 shellcheck (pre-existing), 9 expression (pre-existing), 5 action (down from 7 pre-existing)

End-to-end validated in the qvac-internal sandbox with real org teams:
- tetherto/qvac-internal#12 (caller-gates-callee + standalone gating against the actual qvac-internal-{dev,merge,release} teams)
- Olutest/qvac-tests (public mirror; same harness, single-user allowlist)
- Validation matrix: 9/9 scenarios pass, including the strip-on-non-trusted-apply case

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Proletter <40578159+Proletter@users.noreply.github.com>

* QVAC-18612 infra: gate on-pr-close-* workflows with label-gate

Closes a release-env exposure surfaced when auditing #2023: public-delete-npm-versions.yml (environment: release, packages: write) is invoked by 12 on-pr-close-* workflows, but only embed-llamacpp had label-gate. The other 10 fire on `pull_request: types: [closed]` and reach the release env without authorisation.

This is currently held back only by the manual approval on the release environment. Once that approval is dropped (the goal of QVAC-18612), the label-gate becomes the sole control. This commit makes label-gate that control everywhere.

Pattern is identical to on-pr-close-embed-llamacpp.yml (already on this branch): inline label-gate job (caller side) + needs/if on the delete-npm-versions-trigger reusable call. Reusable callee (public-delete-npm-versions.yml) is unchanged.

on-pr-close-translation-nmtcpp.yml deliberately not modified: it has only workflow_dispatch (no pull_request trigger) and is intrinsically gated by repo-write access.
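The "needs/if" update mentioned above amounts to prepending the gate condition to whatever `if:` a job already had. A rough sketch of that rewrite follows; it is not the migration tooling used for the PR. The function name `gateIf` is hypothetical, and the sketch assumes an existing condition is either a bare expression or a single `${{ ... }}` wrapper.

```javascript
// Hypothetical sketch of prepending the gate to a job's `if:` expression.
const GATE = "needs.label-gate.outputs.authorised == 'true'";

function gateIf(existingIf) {
  const trimmed = (existingIf ?? '').trim();
  // No prior condition: the gate becomes the whole expression.
  if (trimmed === '') return GATE;
  // Assumption: at most one ${{ ... }} wrapper surrounds the whole expression.
  const inner = trimmed.replace(/^\$\{\{\s*([\s\S]*?)\s*\}\}$/, '$1');
  // Parenthesise the original so operator precedence is unchanged.
  return `${GATE} && (${inner})`;
}

console.log(gateIf(undefined));
console.log(gateIf("github.event.pull_request.merged == true"));
```

Parenthesising the original condition matters when it contains `||`, since otherwise the gate would only guard the first operand.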
Proletter
added a commit
that referenced
this pull request
May 24, 2026
…(re-land) (#2023)

* QVAC-18612 infra: gate every secret-bearing workflow with label-gate

Re-land of the label-gate fan-out after PR #1997 was reverted on 2026-05-13 (commit c9b6856). Re-architected to fix the caller-cap permissions violation that broke 30+ on-pr-* workflows the moment a verified label was applied.

Architecture: caller-gates-callee
- Reusable workflows (workflow_call invokees) are NOT modified. PR #1997 embedded a label-gate job inside each reusable callee with `pull-requests: write`, which violates the caller-cap rule for any caller that scopes the call to `pull-requests: read|none`. GitHub enforces this at parse time; the affected workflow files won't even load.
- Callers get a label-gate job at the top of `jobs:` with `pull-requests: write` (which never crosses a caller-cap boundary). Each `uses:` invocation that targets a secret-bearing reusable, plus every standalone secret-bearing job in the same workflow, gains `needs: [..., label-gate]` and an `if:` prepended with `needs.label-gate.outputs.authorised == 'true'`.
- When the gate denies on a `uses:` job, the entire reusable invocation is skipped: the callee runner never starts, no secrets are exposed, and no caller-cap validation can fire because the workflow_call payload is never sent.

The label-gate action checks out from the default branch via sparse checkout, which is the same Tanstack-class supply-chain mitigation landed in the canary fix on PR #1971 / #1973.

Workflow-by-workflow stats:
- 59 caller workflows migrated (label-gate + needs/if updates)
- 56 reusable callees, exempt workflows, and no-secret workflows intentionally left UNCHANGED on disk
- Pre-existing `authorize-pr` peer jobs preserved (belt-and-suspenders; removal is a follow-up after a soak period)
- approval-worker.yml and approval-check-worker.yml exempt (gating them creates a deadlock; we explicitly do not touch them)

Pre-flight verification before push:
- `python3 .github/scripts/audit-workflow-permissions.py` -> 0 hard violations across 162 caller-callee edges (vs. 21 hard violations after the naive PR #1997-style migration; the audit was added in the previous commit precisely to catch this regression class)
- `actionlint .github/workflows/*.{yml,yaml}` reports identical issue counts before and after the migration: 1832 shellcheck (pre-existing), 9 expression (pre-existing), 5 action (down from 7 pre-existing)

End-to-end validated in the qvac-internal sandbox with real org teams:
- tetherto/qvac-internal#12 (caller-gates-callee + standalone gating against the actual qvac-internal-{dev,merge,release} teams)
- Olutest/qvac-tests (public mirror; same harness, single-user allowlist)
- Validation matrix: 9/9 scenarios pass, including the strip-on-non-trusted-apply case

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Proletter <40578159+Proletter@users.noreply.github.com>

* QVAC-18612 infra: gate on-pr-close-* workflows with label-gate

Closes a release-env exposure surfaced when auditing #2023: public-delete-npm-versions.yml (environment: release, packages: write) is invoked by 12 on-pr-close-* workflows, but only embed-llamacpp had label-gate. The other 10 fire on `pull_request: types: [closed]` and reach the release env without authorisation.

This is currently held back only by the manual approval on the release environment. Once that approval is dropped (the goal of QVAC-18612), the label-gate becomes the sole control. This commit makes label-gate that control everywhere.

Pattern is identical to on-pr-close-embed-llamacpp.yml (already on this branch): inline label-gate job (caller side) + needs/if on the delete-npm-versions-trigger reusable call. Reusable callee (public-delete-npm-versions.yml) is unchanged.

on-pr-close-translation-nmtcpp.yml deliberately not modified: it has only workflow_dispatch (no pull_request trigger) and is intrinsically gated by repo-write access.
🎯 What problem does this PR solve?

The `label-gate` action shipped in "QVAC-18608 infra: add .github/actions/label-gate (Node 20)" (#1968) hard-fails on every PR-event invocation with `unexpected failure: required input 'github-token' is missing`, even when the workflow correctly passes `github-token: ${{ secrets.GITHUB_TOKEN }}`. Caught live by the QVAC-18612 canary in "QVAC-18612 infra: repurpose vulkaninfo as label-gate safety canary" (#1971, run id 25672483584) before any production workflow could be migrated. Without this fix, the QVAC-18612 fan-out would red-X every PR across 75 workflows on the first run after merge.

📝 How does it solve it?

Root cause: `getInput()` in `src/index.mjs` was doing `name.toUpperCase().replace(/-/g, '_')`, looking up `INPUT_GITHUB_TOKEN`. The GitHub Actions runner, and `@actions/core`, only replace spaces with underscores, not hyphens. The runner sets `INPUT_GITHUB-TOKEN` (hyphen preserved, technically non-POSIX but Node exposes it via `process.env` regardless). My implementation never found the value and threw the misleading "missing input" error.

Why local smoke tests didn't catch it: my pre-merge smoke set `INPUT_GITHUB_TOKEN=...` (matching the buggy lookup). Both sides were wrong in the same direction, so the smoke "passed". Real CI exposes this immediately, which is exactly the failure the canary was designed to find.

Fix: `getInput()` now uses `name.replace(/ /g, '_').toUpperCase()`, matching the runner / `@actions/core` convention exactly.

Testability: `getInput()` is now exported (with an injectable `env` arg) so the env-var-name resolution can be unit-tested. Top-level `main()` is gated on `import.meta.url === argv[1]` so importing `index.mjs` from tests no longer triggers a real run.

🧪 How was it tested?

- 9 new tests in `test/index.test.mjs` pin the env-var resolution against the runner contract:
  * `INPUT_GITHUB-TOKEN` (hyphen preserved) → resolves correctly
  * `INPUT_GITHUB_TOKEN` (hyphen-replaced, the old bug) → does not resolve (locks the contract against any accidental "helpful" reintroduction of the substitution)
  * spaces are still replaced with underscores (as in `@actions/core`)
  * trim, missing-required, and defaults-to-`process.env` paths
- `node --test .github/actions/label-gate/test/*.test.mjs` → 53/53 pass (was 44 before, +9 new).
- End-to-end smoke against the runner-correct env-var name (`INPUT_GITHUB-TOKEN=...`): exit 0, `authorised=false`, notice "'verified' label is not currently applied to PR #1971".
- … `authorised=false` (gate-job green, downstream skipped) instead of a red gate-job.
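The import guard mentioned under Testability can be sketched as a small predicate. Since `import.meta.url` is a `file://` URL while `argv[1]` is a filesystem path, the comparison needs a conversion step. The helper below is a hypothetical illustration, not the code in `src/index.mjs`, and its conversion is a POSIX-only simplification; `pathToFileURL` from `node:url` is the more portable route.

```javascript
// Hypothetical sketch of a "run main() only when executed directly" guard.
// metaUrl would be import.meta.url; argv1 would be process.argv[1].
function isMainModule(metaUrl, argv1) {
  // When there is no script path (e.g. a REPL), never auto-run.
  if (!argv1) return false;
  // Assumption: POSIX paths only. Decode percent-escapes such as %20.
  const modulePath = decodeURIComponent(new URL(metaUrl).pathname);
  return modulePath === argv1;
}

// Executed directly: the module URL and the script path refer to the same file.
console.log(isMainModule('file:///work/src/index.mjs', '/work/src/index.mjs')); // true
// Imported from a test file: argv[1] is the test runner's entry, so main() is skipped.
console.log(isMainModule('file:///work/src/index.mjs', '/work/test/index.test.mjs')); // false
```

In an ES module this would typically be used as `if (isMainModule(import.meta.url, process.argv[1])) main();`, leaving `getInput` importable without side effects.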